ABSTRACT

The average Kullback-Leibler (K-L) information index
(H. Chang and Z. Ying, in press) is a newly proposed statistic in
Computerized Adaptive Testing (CAT) item selection based on the
global information function. The objectives of this study were to
improve understanding of the K-L index with various parameters and to
compare the performance of the K-L index with the traditional
information method in CAT item selection. The results of this study,
based on simulated and real data with 500 items each, provide
evidence that Chang and Ying's global information method produced
similar or better true ability theta estimates than the more
traditional information approach in CAT item selection. In addition,
results from the real item pool analyses indicate the parameter that
provides the best theta estimates among the four K-L indices studied.
(Contains one table, seven figures, and six references.)

(Author/SLD) 
Abstract

The average Kullback-Leibler (K-L) information index (Chang & Ying, in
press) is a newly proposed statistic in CAT item selection based on the global
information function. The objectives of this study are: (1) to better
understand the performance of the K-L index with different $\delta_n$ values;
and (2) to compare the performance of the K-L index with the traditional
information method in CAT item selection. The results of this study, based on
both simulated and real data, provide evidence that Chang and Ying's global
information method produces similar or better $\theta$ estimates than the more
traditional information approach in CAT item selection. In addition, results
from the real item pool analyses indicate that the K-L index having
$\delta_n = 3/\sqrt{n}$ produces the best $\theta$ estimates among the four
K-L indices studied.



A Comparison of the Traditional Maximum Information Method and 
the Global Information Method in CAT Item Selection 
Computerized adaptive testing (CAT) has become popular because this
method of testing can provide the same level of measurement precision as
conventional paper and pencil testing with the administration of fewer items.
This test length reduction advantage provided by CAT is achieved by
administering a tailored test to each examinee (Lord, 1971, 1980). CAT
selects items that best match an examinee's ability level. The most popular
item selection method used in CAT is the maximum item information method: the
(n+1)th item selected for an examinee is the one which provides the maximum
information at the examinee's estimated ability ($\hat{\theta}_n$) based on
the n items previously administered to that examinee. The item information
function is defined in Lord (1980) and is a measure of information in the
neighborhood of the $\theta$ of interest. Therefore, it is referred to as
local information. When the estimated ability $\hat{\theta}_n$ is not close to
the examinee's true ability ($\theta_0$), the item which has maximum local
information at $\hat{\theta}_n$ may not be the most appropriate item to
administer to the examinee having true ability $\theta_0$. This will often
occur at the early stages of the CAT.
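
The maximum information rule described above can be sketched in a few lines of Python. This is an illustration only, assuming the three-parameter logistic (3-PL) model used later in the Methods section. The function names, the scaling constant D = 1.7, the clipping constant EPS, and the toy item parameters are all assumptions and are not taken from the study.

```python
import numpy as np

D = 1.7      # assumed scaling constant; the paper does not state it
EPS = 1e-10  # assumed safeguard that keeps probabilities inside (0, 1)

def p3pl(theta, a, b, c):
    """Probability of a correct response under the 3-PL model."""
    p = c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))
    return np.clip(p, EPS, 1.0 - EPS)

def item_information(theta, a, b, c):
    """Item information function for the 3-PL model (Lord, 1980)."""
    p = p3pl(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def select_max_info(theta_hat, a, b, c, administered):
    """Index of the unused item with maximum information at theta_hat."""
    info = np.array(item_information(theta_hat, a, b, c), dtype=float)
    for j in administered:
        info[j] = -np.inf  # exclude items that were already given
    return int(np.argmax(info))

# Toy pool of three hypothetical items
a = np.array([1.0, 1.5, 0.8])
b = np.array([-1.0, 0.0, 1.0])
c = np.array([0.2, 0.2, 0.2])
next_item = select_max_info(0.0, a, b, c, administered=set())
```

Because the rule only looks at information at the current estimate, a poor early estimate leads directly to a poorly targeted item, which is the weakness the global method is meant to address.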

Chang (1995) and Chang and Ying (in press) proposed the use of an
alternative item selection method, which they call the global information
method, to improve item selection at the early stages of a CAT. The global
information function was derived by applying the Kullback-Leibler (K-L)
information function in the IRT context. The Kullback-Leibler item
information function is defined as the expected value of the log-likelihood
ratio, that is, the log of the likelihood function at the true ability
($\theta_0$) divided by the likelihood function at any other ability level
$\theta$ for an item, with the expectation taken at $\theta_0$. It has been
shown that the likelihood ratio statistic is the most powerful statistic to
distinguish $\theta_0$ from any other $\theta$ value (Mood, Graybill, & Boes,
1985).
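
For a dichotomously scored item, the K-L item information can be written in closed form as the expected log-likelihood ratio under the true ability. The sketch below evaluates it for a 3-PL item. The function names, the scaling constant D = 1.7, the clipping constant EPS, and the example parameters are assumptions made for illustration only.

```python
import numpy as np

D = 1.7      # assumed scaling constant; the paper does not state it
EPS = 1e-10  # assumed safeguard that keeps probabilities inside (0, 1)

def p3pl(theta, a, b, c):
    """Probability of a correct response under the 3-PL model."""
    p = c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))
    return np.clip(p, EPS, 1.0 - EPS)

def kl_item_information(theta, theta0, a, b, c):
    """Expected log-likelihood ratio of theta0 versus theta for one item,
    with the expectation taken over responses generated at theta0."""
    p0 = p3pl(theta0, a, b, c)
    p = p3pl(theta, a, b, c)
    return p0 * np.log(p0 / p) + (1.0 - p0) * np.log((1.0 - p0) / (1.0 - p))

# Hypothetical item: the value is zero at theta = theta0 and grows as
# theta moves away from theta0 in either direction.
at_truth = kl_item_information(0.5, 0.5, 1.2, 0.0, 0.2)
nearby = kl_item_information(1.0, 0.5, 1.2, 0.0, 0.2)
farther = kl_item_information(1.5, 0.5, 1.2, 0.0, 0.2)
```

Unlike the local information function, this quantity depends on two ability values, so it describes how well an item separates the true ability from abilities that are far away as well as from those nearby.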

Chang and Ying established the connection between the local and global
information functions: The second derivative of the global item information
function is the traditional item information function. Geometrically, the
traditional information at $\theta_0$ is the curvature of the global
information function at $\theta_0$. Based on the relationship between the
global and local information functions, Chang and Ying developed an
information index, the average K-L information index, for CAT item selection.
The average K-L information index can be defined as:



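$$K_j(\hat{\theta}_n) = \int_{\hat{\theta}_n - \delta_n}^{\hat{\theta}_n + \delta_n} K_j(\theta \,\|\, \hat{\theta}_n)\, d\theta , \tag{1}$$

where $\hat{\theta}_n$ is the maximum likelihood estimator of $\theta_0$ based
on n items administered to an examinee, and $K_j(\theta \,\|\, \hat{\theta}_n)$
is defined as

$$K_j(\theta \,\|\, \hat{\theta}_n) = P_j(\hat{\theta}_n) \log\frac{P_j(\hat{\theta}_n)}{P_j(\theta)} + \left[1 - P_j(\hat{\theta}_n)\right] \log\frac{1 - P_j(\hat{\theta}_n)}{1 - P_j(\theta)} ,$$

with $P_j(\theta)$ the probability of a correct response to item j at ability
$\theta$.
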

The parameter $\delta_n$ in (1) generates a sequence which converges to 0 as n
increases. The $\delta_n$ controls the width of the interval under the K-L
information index curve. This interval is expected to contain the true ability
parameter, $\theta_0$, and will narrow down to the neighborhood of $\theta_0$
for an examinee as the number of items administered to the examinee increases
and as $\hat{\theta}_n$ approaches $\theta_0$. As Chang and Ying pointed out,
for a small $\delta_n$ value, the maximum area under the K-L curve is
equivalent to the maximum curvature, which is the maximum value of the
traditional (local) information. For a large $\delta_n$ value, the area is
very much influenced by the global information. Therefore, $\delta_n$ is
an important parameter in the average K-L index and warrants study under
different conditions.
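
The role of $\delta_n$ can be illustrated numerically. The sketch below approximates the integral in (1) with the trapezoidal rule for every item in a small pool and compares it with the traditional item information. It assumes the 3-PL model and integration limits of $\hat{\theta}_n \pm \delta_n$. The function names, D = 1.7, EPS, the number of grid points, and the toy parameters are assumptions, not details taken from Chang and Ying.

```python
import numpy as np

D = 1.7      # assumed scaling constant; the paper does not state it
EPS = 1e-10  # assumed safeguard that keeps probabilities inside (0, 1)

def p3pl(theta, a, b, c):
    """Probability of a correct response under the 3-PL model."""
    p = c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))
    return np.clip(p, EPS, 1.0 - EPS)

def item_information(theta, a, b, c):
    """Traditional (local) item information for the 3-PL model."""
    p = p3pl(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def average_kl_index(theta_hat, delta_n, a, b, c, n_points=201):
    """Trapezoidal approximation of the area under the K-L curve on
    [theta_hat - delta_n, theta_hat + delta_n], one value per item."""
    grid = np.linspace(theta_hat - delta_n, theta_hat + delta_n, n_points)
    p_hat = p3pl(theta_hat, a, b, c)       # shape (items,)
    p = p3pl(grid[:, None], a, b, c)       # shape (points, items)
    kl = (p_hat * np.log(p_hat / p)
          + (1.0 - p_hat) * np.log((1.0 - p_hat) / (1.0 - p)))
    widths = np.diff(grid)[:, None]
    return np.sum(0.5 * (kl[1:] + kl[:-1]) * widths, axis=0)

# Toy pool of three hypothetical items
a = np.array([1.0, 1.5, 0.8])
b = np.array([-1.0, 0.0, 1.0])
c = np.array([0.2, 0.2, 0.2])
narrow = average_kl_index(0.0, 0.01, a, b, c)  # late in the test
wide = average_kl_index(0.0, 1.0, a, b, c)     # early in the test
local = item_information(0.0, a, b, c)
```

For a narrow interval, a second-order Taylor expansion of the K-L function around $\hat{\theta}_n$ gives an area of roughly $I_j(\hat{\theta}_n)\,\delta_n^3/3$, so the index ranks items in the same order as the traditional information. For a wide interval, the area also reflects how well each item separates ability values far from the current estimate.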

Chang and Ying conducted simulation studies to compare the performance
of the local and global information functions in CAT item selection. They
concluded that the global information index outperformed the traditional
maximum information approach in most of the conditions in their simulation
studies. In their study, the $\delta_n$ values used in (1) were $3/\sqrt{n}$
and $1/n$. However, the effects of these two values on item selection were not
directly compared because these values were used separately in two different
simulation studies by Chang and Ying. Also, the test length used in Chang and
Ying's study was fixed at 40. The performance of the average K-L index on
shorter tests has not been investigated. In addition, in Chang and Ying's
study, the simulated item parameters were uniformly distributed, which is
often not the case in practice. Therefore, more extensive simulation work
needs to be carried out to compare the global information method with the
traditional information method.

Objectives 

Because the average K-L information index is a newly proposed statistic,
the behavior of this statistic has not been studied extensively using either
simulation techniques or empirical data. The purposes of this study were:
(1) to better understand the performance of the average K-L index with
different $\delta_n$ values; and (2) to compare the performance of the global
information method with the conventional information method in CAT item
selection using both real and simulated data.

Methods 

1. Data

Both real and simulated data were used in the study. Five hundred
Structure and Written Expression items from the Test of English as a Foreign
Language (TOEFL) were used to form an item pool. In addition, five hundred
simulated items were included in a simulated item pool. The three-parameter
logistic (3-PL) model was used in both the real and simulated conditions. For
the simulated items, the values of the item discrimination parameter (a) in
the 3-PL model were generated from a lognormal distribution
($\log(a) \sim N(0, 0.5^2)$). The values of the item difficulty parameter (b)
in the 3-PL were generated from a $N(0, 2^2)$ distribution. Finally, the
values of the c parameter in the 3-PL were generated from a beta distribution
with $\alpha = 4$ and $\beta = 13$ (the mean of the beta distribution = 0.24
and the distribution has a weight equivalent to 15 observations of the
responses of examinees of very low ability). The distributions of the item
discrimination and item difficulty parameters used in this study are also used
as the default prior distributions in the computer program BILOG (Mislevy &
Bock, 1990) and represent common data structures in practice. To make the
simulated data more representative of the real data, items having b parameters
greater than 3.0 or less than -3.0 or items having a parameters greater than
2.5 were eliminated. The beta distribution used to generate the
pseudo-guessing or c parameters provides values similar to those observed in
large scale testing programs such as TOEFL. The summary statistics for the
item parameters in the real and the simulated item pool are presented in
Table 1.
Insert Table 1 about here
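
The generation of the simulated item pool can be sketched as follows. The distributions and elimination bounds follow the description above. The function name, the random seed, and the choice to keep drawing until 500 items remain after elimination are assumptions, since the paper does not say how the pool size was maintained.

```python
import numpy as np

def simulate_item_pool(n_items=500, seed=0):
    """Draw 3-PL item parameters and keep only items inside the bounds.

    Assumption: eliminated items are replaced by fresh draws until
    n_items items remain.
    """
    rng = np.random.default_rng(seed)
    a = np.empty(0)
    b = np.empty(0)
    c = np.empty(0)
    while a.size < n_items:
        a_new = rng.lognormal(mean=0.0, sigma=0.5, size=n_items)  # log(a) ~ N(0, 0.5^2)
        b_new = rng.normal(loc=0.0, scale=2.0, size=n_items)      # b ~ N(0, 2^2)
        c_new = rng.beta(4.0, 13.0, size=n_items)                 # c ~ Beta(4, 13)
        # Eliminate items with |b| > 3.0 or a > 2.5
        keep = (np.abs(b_new) <= 3.0) & (a_new <= 2.5)
        a = np.concatenate([a, a_new[keep]])
        b = np.concatenate([b, b_new[keep]])
        c = np.concatenate([c, c_new[keep]])
    return a[:n_items], b[:n_items], c[:n_items]

a, b, c = simulate_item_pool()
summary = {name: (x.mean(), x.std(ddof=1), x.min(), x.max())
           for name, x in (("a", a), ("b", b), ("c", c))}
```

The `summary` dictionary holds the mean, standard deviation, minimum, and maximum of each parameter, which are the same quantities reported for the item pools in Table 1. The values will differ from the table because the random draws are not those of the study.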



2. True abilities

The following six true abilities ($\theta_0$s) were used in the study:
-3.0, -2.0, -1.0, 1.0, 2.0, 3.0. These values cover the ability range
typically observed in practice.

3. Test length

The test lengths used were 20 for the real data case and 30 for the 
simulated data case. 

4. $\delta_n$ used in the average K-L information index

The following four $\delta_n$ values were used in the real data case:
$3/\sqrt{n}$, $1/n$, $1/e^{0.1n}$, and $3/e^{0.1n}$.

The convergence rate (to zero) of these four $\delta_n$ values as a function
of n is presented in Figure 1.



Insert Figure 1 about here 



Figure 1 illustrates that the four $\delta_n$ sequences converge to 0 at
different rates. As mentioned earlier, Chang and Ying pointed out that
maximizing the area under the K-L curve is equivalent to maximizing the
traditional information when the $\delta_n$ value is small. The shift from
K-L to the traditional information occurs as the $\delta_n$ sequence
approaches 0. In other words, the K-L indices associated with each of the four
$\delta_n$ sequences shift from global information to local information
(traditional information) at different rates. For example, the K-L index
having $\delta_n = 1/n$ shifts from global to local information around the
administration of the 15th item; the K-L index having
$\delta_n = 1/e^{0.1n}$ shifts from global to local information around the
administration of the 25th item; and the K-L index having
$\delta_n = 3/e^{0.1n}$ shifts from global to local information around the
administration of the 35th item. The sequence $\delta_n = 3/\sqrt{n}$ clearly
converges to 0 much more slowly than the other three $\delta_n$ sequences.
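
The four sequences are easy to evaluate directly, which makes the difference in convergence rates concrete. The following sketch computes them for any number of administered items n. The function name and dictionary keys are assumptions chosen for illustration.

```python
import numpy as np

def delta_sequences(n):
    """Return the four delta_n sequences evaluated at n (scalar or array)."""
    n = np.asarray(n, dtype=float)
    return {
        "3/sqrt(n)": 3.0 / np.sqrt(n),
        "1/n": 1.0 / n,
        "1/exp(0.1n)": np.exp(-0.1 * n),
        "3/exp(0.1n)": 3.0 * np.exp(-0.1 * n),
    }

# Interval half-widths after 20 items, the real data test length
at_20 = delta_sequences(20)

# Full sequences for n = 1, ..., 40, e.g. for plotting against n
curves = delta_sequences(np.arange(1, 41))
```

At n = 20 the sequence $3/\sqrt{n}$ is still about 0.67, while $1/n$ has already dropped to 0.05, so the index based on $3/\sqrt{n}$ integrates over a much wider ability interval at the end of a 20-item test.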

In summary, there were 30 experimental conditions in the real data case:
6 true $\theta$ values (-3.0, -2.0, -1.0, 1.0, 2.0, and 3.0) x 4 $\delta_n$
values ($3/\sqrt{n}$, $1/n$, $1/e^{0.1n}$, and $3/e^{0.1n}$) for the global
information method plus the 6 true $\theta$ values for the conventional
information method. Because results from the real data pool analyses indicate
that the K-L index having $\delta_n = 3/\sqrt{n}$ produced the best $\theta$
estimates among the four K-L indices, only the K-L index having
$\delta_n = 3/\sqrt{n}$ was used to select items from the simulated item pool.
There were 12 conditions in the simulated data case: the 6 true $\theta$
values for both the K-L index and the conventional information method. The
test length for the real item pool was 20 and for the simulated item pool, 30.
One hundred replications were conducted at each of the six ability levels for
each of these experimental conditions.
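
A single replication of one experimental condition might look like the sketch below. It is a simplified illustration, not the program used in the study. The following details are assumptions: the first two items are drawn at random, ability is estimated by maximizing the likelihood over a fixed grid on [-4, 4], D = 1.7, probabilities are clipped away from 0 and 1, and the K-L index uses $\delta_n = 3/\sqrt{n}$ with trapezoidal integration. All function names are hypothetical.

```python
import numpy as np

D = 1.7      # assumed scaling constant
EPS = 1e-10  # assumed safeguard that keeps probabilities inside (0, 1)

def p3pl(theta, a, b, c):
    p = c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))
    return np.clip(p, EPS, 1.0 - EPS)

def mle_theta(resp, a, b, c, grid):
    """Grid-search maximum likelihood estimate (a simplification)."""
    p = p3pl(grid[:, None], a, b, c)  # shape (grid points, items)
    loglik = (resp * np.log(p) + (1.0 - resp) * np.log(1.0 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]

def info_index(theta_hat, a, b, c):
    p = p3pl(theta_hat, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def kl_index(theta_hat, delta, a, b, c, n_points=41):
    x = np.linspace(theta_hat - delta, theta_hat + delta, n_points)
    p_hat = p3pl(theta_hat, a, b, c)
    p = p3pl(x[:, None], a, b, c)
    kl = (p_hat * np.log(p_hat / p)
          + (1.0 - p_hat) * np.log((1.0 - p_hat) / (1.0 - p)))
    return np.sum(0.5 * (kl[1:] + kl[:-1]) * np.diff(x)[:, None], axis=0)

def run_cat(theta0, a, b, c, length, rng, use_kl):
    """Return the ability estimates after items 3, ..., length."""
    grid = np.linspace(-4.0, 4.0, 161)
    # Assumption: the first two items are chosen at random
    used = [int(j) for j in rng.choice(a.size, size=2, replace=False)]
    resp = [float(rng.random() < p3pl(theta0, a[j], b[j], c[j])) for j in used]
    theta_hat = mle_theta(np.array(resp), a[used], b[used], c[used], grid)
    estimates = []
    for n in range(2, length):
        if use_kl:
            index = kl_index(theta_hat, 3.0 / np.sqrt(n), a, b, c)
        else:
            index = info_index(theta_hat, a, b, c)
        index[used] = -np.inf  # never reuse an item
        j = int(np.argmax(index))
        used.append(j)
        resp.append(float(rng.random() < p3pl(theta0, a[j], b[j], c[j])))
        theta_hat = mle_theta(np.array(resp), a[used], b[used], c[used], grid)
        estimates.append(theta_hat)
    return np.array(estimates)
```

Running `run_cat` 100 times for each true ability, once with `use_kl=True` and once with `use_kl=False`, would reproduce the layout of the simulated data conditions, although not necessarily the study's numerical results.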

5. Analyses

The bias and mean squared error at each of the six ability levels for
each of the experimental conditions were compared.

The bias for the $\theta$ estimate after item i is administered to examinee j
having true ability $\theta_0$ is defined as

$$\mathrm{Bias}_i = \frac{1}{100} \sum_{j=1}^{100} \left(\hat{\theta}_{ij} - \theta_0\right),$$

and mean squared error for the $\theta$ estimate after item i is administered
to examinee j having true ability $\theta_0$ is defined as

$$\mathrm{MSE}_i = \frac{1}{100} \sum_{j=1}^{100} \left(\hat{\theta}_{ij} - \theta_0\right)^2,$$

where $\hat{\theta}_{ij}$ is the estimate for $\theta_0$ obtained using either
the global information method or the conventional information method, and
i = 3, ..., 20 or 30. The Bias and MSE for the first two items were not
computed because these items were not selected by the information methods.
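
Given a matrix of ability estimates with one row per replication and one column per item position, both statistics reduce to column means. The sketch below assumes that layout. The function name and the small example matrix are hypothetical.

```python
import numpy as np

def bias_and_mse(theta_hat, theta0):
    """Bias and MSE at each item position.

    theta_hat: array of shape (replications, item positions) holding the
    ability estimate of each simulated examinee after each item.
    """
    err = np.asarray(theta_hat, dtype=float) - theta0
    return err.mean(axis=0), (err ** 2).mean(axis=0)

# Two replications, two item positions, true ability 2.0
example = [[1.0, 2.0],
           [3.0, 2.0]]
bias, mse = bias_and_mse(example, theta0=2.0)
```

In the example the errors at the first item position are -1 and +1, so the bias there is 0 while the MSE is 1. This shows why the two statistics are reported together: estimates can be unbiased on average and still imprecise.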



Results

Real Data

Figures 2 and 3 present the bias and MSE at the six true $\theta$ values for
the K-L index and the conventional information index (INFO). The four
$\delta_n$ values for the K-L index and INFO are all presented in each of
these figures. It should be noted that the vertical scales for the plots
presented in Figures 2 and 3 are not the same. The different vertical scales
used here provide a better comparison for the different indices so that the
differences among the indices can be clearly observed. It can be seen in
Figure 2 that the K-L indices having different $\delta_n$ values have similar
or smaller bias in estimating the true $\theta$s than INFO. It appears that
the K-L index having $\delta_n = 3/\sqrt{n}$ outperforms the K-L indices
having other $\delta_n$s. In addition, the K-L index having
$\delta_n = 3/\sqrt{n}$ has much smaller bias than INFO when
$\theta_0 = -3.0$ and $\theta_0 = 2.0$. It also has a much smaller MSE than
INFO at four out of six true $\theta$ values: $\theta_0 = -3.0$,
$\theta_0 = -2.0$, $\theta_0 = -1.0$, and $\theta_0 = 2.0$. It has an MSE
similar to that of INFO at the other two true $\theta$ values. Figure 1
illustrates that $\delta_n = 3/\sqrt{n}$ has the slowest convergence rate of
the four $\delta_n$ values studied. In other words, the transformation from
K-L to INFO for this index is much slower than for the other K-L indices. The
better performance in terms of bias and MSE for this index may be due to this
slow transformation from K-L to INFO.

When $\theta_0 = 3.0$, the bias and MSE statistics slightly increase when more
items are selected for all the indices studied. One explanation might be that
the total number of items selected is too small (n = 20) to produce a stable
estimate at $\theta_0 = 3.0$. This phenomenon was not observed in the
simulated data case when the number of items selected was 30.

Simulated Data 

The comparisons between K-L and INFO using simulated data are
summarized in Figures 4 and 5. Because results from the real data pool
analyses indicate that the K-L index having $\delta_n = 3/\sqrt{n}$ produced
the best $\theta$ estimates among the four K-L indices, only the K-L index
having $\delta_n = 3/\sqrt{n}$ was used to select items from the simulated
item pool. All plots, except the last one, in Figure 4 share a common vertical
scale to enable the comparison of the performance of these two indices at
different true $\theta$ values. The vertical scale for $\theta_0 = 3.0$ is
larger than that of the other five plots in Figure 4 to cover the extreme
values. All plots in Figure 5 are on a common scale.

When $\theta_0 = -3.0$, INFO has smaller bias and MSE than K-L. When true
$\theta$ is between -2.0 and 1.0, K-L outperforms INFO in terms of both bias
and MSE. When $\theta_0 \ge 2.0$, the performance of the two indices is very
similar.

The performance of INFO and K-L when 10, 15, 20, 25, or 30 items are
selected can also be compared across $-3.0 \le \theta_0 \le 3.0$. It can be
seen in Figures 6 and 7 that K-L has smaller bias and MSE when
$-2.0 \le \theta_0 \le 2.0$ and when the length of CAT ranges from 10 to 30.

Conclusions 

The results of this study, based on both simulated and real data,
provide evidence that Chang and Ying's global information method produces
similar or better $\theta$ estimates than the more traditional information
approach in CAT item selection. Because the present study is exploratory,
further studies need to be carried out. For example, different $\delta_n$
values in the K-L index need to be further compared using simulation
techniques to confirm the results found in this study, namely, that
$\delta_n = 3/\sqrt{n}$ produces better results than other $\delta_n$ values.
In addition, the performance of K-L and INFO for other $\theta_0$ values, such
as $\theta_0$ = -2.5, -1.5, 1.5, or 2.5, also needs to be studied and
compared.

In conclusion, the K-L index appears to be a very promising statistical
index in CAT item selection. However, the K-L index is more computationally
intense than INFO because integration is involved in the calculations.
Another study that might be done would look at ways to make the K-L index more
computationally efficient.
Table 1

Descriptive Statistics for the Real and the Simulated Item Pools
(N = 500)

Pool   Parameter   Mean          SD     Min     Max
REAL   A           1.29          0.44    0.20   2.35
       B           0.17          0.66   -0.99   2.74
       C           0.21          0.13    0.00   0.70
SIM    A           1.09          0.49    0.27   2.44
       B           [illegible]   1.55   -3.00   2.97
       C           0.24          0.10    0.02   0.62

Figure 1

The Four $\delta_n$ Sequences in the K-L Information Index

[Line plot. Legend: 3/sqrt(n), 1/n, 1/e^(0.1*n), 3/e^(0.1*n).]

Figure 2 Bias at Six $\theta_0$ -- Real Data

[Six panels, test length = 20, one per true theta: -3.0, -2.0, -1.0, 1.0,
2.0, 3.0. Legend: 3/sqrt(n), 1/n, 1/exp(0.1*n), 3/exp(0.1*n), info.]

Figure 3 MSE at Six $\theta_0$ -- Real Data

[Panels with test length = 20, one per true theta value. Legend: 3/sqrt(n),
1/n, 1/exp(0.1*n), 3/exp(0.1*n), info.]

Figure 4 Bias at Six $\theta_0$ -- Simulated Data

[Six panels, test length = 30, one per true theta: -3.0, -2.0, -1.0, 1.0,
2.0, 3.0.]

Figure 5 MSE at Six $\theta_0$ -- Simulated Data

[Panels with test length = 30, one per true theta value.]

Figure 6 Bias at Different Number of Items Selected

[Panels by number of items selected (ITEM = 10, 15, 20, 25, 30). Horizontal
axis: THETA. Legend: K-L, INFO.]

Figure 7 MSE at Different Number of Items Selected

[Panels by number of items selected (ITEM = 10, 15, 20, 25, 30).]